Classification Learning: From Paradigm Conflicts to Engineering Choices
Abstract
Classification learning applies to a wide range of tasks, from diagnosis and troubleshooting to pattern recognition and keyword assignment. Many methods have been used to build classification systems, including artificial neural networks, rule-based expert systems (both hand-built and inductively learned), fuzzy rule systems, memory-based, case-based, and nearest neighbor systems, generalized radial basis functions, classifier systems, and others. Research subcommunities have tended to specialize in one or another of these mechanisms, and many papers have argued for the superiority of one method vis-a-vis others. I will argue that none of these methods is universal, nor does any one method have a priori superiority over all others. To support this argument, I show that all these methods are related, and in fact can be viewed as lying at points along a continuous spectrum, with memory-based methods occupying a pivotal position. I further argue that the selection of one or another of these methods should generally be seen as an engineering choice, even when the research goal is to explore the potential of some method for explaining aspects of cognition; methods and problem areas must be considered together. Finally, a set of properties is identified that can be used to characterize each of the classification methods, and to begin to build an engineering science for classification tasks.

1.0 Unified Framework for Classification Learning

A wide variety of classification learning methods can be seen as related, as points on a spectrum of methods. Memory-based Reasoning (MBR) is the key to this analysis. The idea of MBR is to use a training set without modification as the basis of a nearest neighbor classification method. Any new example to be classified is compared to each element in the training set, and the distance from the new example is computed for each training set element.
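The comparison step described above, together with the nearest-neighbor decision that the text goes on to describe, can be sketched as follows. This is a minimal illustration and not the author's implementation: the Euclidean distance, the inverse-distance weighting, and the names `euclidean` and `mbr_classify` are all assumptions chosen for the example, since the text does not fix a distance measure or a voting formula.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Distance between two equal-length numeric feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mbr_classify(training_set, query, k=1, weighted=False):
    """Classify `query` from an unmodified training set.

    training_set: list of (feature_vector, label) pairs, used as-is.
    k:            number of nearest neighbors consulted.
    weighted:     if True, each neighbor votes with weight 1/distance
                  (one possible distance-weighted scheme, assumed here).
    """
    # Compare the query to every stored case and record its distance.
    scored = sorted(
        ((euclidean(features, query), label) for features, label in training_set),
        key=lambda pair: pair[0],
    )
    nearest = scored[:k]

    # Tally the votes of the k nearest cases.
    votes = defaultdict(float)
    for dist, label in nearest:
        votes[label] += 1.0 / (dist + 1e-9) if weighted else 1.0

    # On a tie, the class whose first vote came from the closer neighbor wins.
    return max(votes, key=votes.get)


# Hypothetical toy data: two clusters with labels "a" and "b".
training = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((6, 5), "b")]
print(mbr_classify(training, (0.2, 0.1)))        # single nearest neighbor
print(mbr_classify(training, (5.5, 5.0), k=3))   # majority vote among 3
```

Note that nothing is learned here: every stored case is kept and scanned at query time, which is what makes the training set itself the classifier.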
The nearest neighbor (or nearest k neighbors) are found in the training set, and their classifications are used to decide on the classification for the new example. In a single-nearest-neighbor version of MBR, the class of the closest training set neighbor is assigned to the new example. In a k-nearest-neighbor version, if all k nearest neighbors have the same class, it is assigned to the new example; if more than one class appears within the nearest k neighbors, then a voting or distance-weighted voting scheme is used to classify the new example. As stated, MBR has no learning. (It is certainly possible, and for real-world problems generally a good idea, to include learning with MBR; we will come back to this issue later.)

First, we can relate MBR to rule-based systems; in particular, if looked at the right way, a single-nearest-neighbor MBR system is already a rule-based system. To see this, note that MBR cases consist of situations and actions, like production rules. There are as many "rules" as there are cases in the MBR training set database. Each "left-hand side" is the conjunction of all the features of the case. Each "right-hand side" is the classification. Using this observation, we can see that there is a spectrum of rule-based systems between MBR and an "ordinary" rule-based system with a relatively small number of rules. We can move along this spectrum by using AI learning techniques: for example, we can find irrelevant features by noting that certain left-hand side variables have no correlation with classifications, and can thus be eliminated, yielding shorter rules. Also, some cases may be repeated, and as variables are eliminated, more cases will become identical, and can

From: AAAI Technical Report SS-93-07. Compilation copyright © 1993, AAAI (www.aaai.org). All rights reserved.
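The rule view of MBR described above, with one rule per stored case and shorter rules as features are eliminated, can be illustrated with a small sketch. It rests on stated assumptions: features are symbolic, and a feature is treated as irrelevant when removing it never makes two cases with different classes look identical. That consistency check is a simple stand-in for the correlation test the text mentions, not the author's procedure, and the names `cases_to_rules`, `drop_feature`, `is_consistent`, and `compress` are hypothetical.

```python
def cases_to_rules(cases):
    """Turn each case into a rule: (conjunction of all features) -> class."""
    return [(frozenset(features.items()), label) for features, label in cases]


def drop_feature(rules, name):
    """Remove one variable from every left-hand side."""
    return [
        (frozenset((k, v) for k, v in lhs if k != name), rhs)
        for lhs, rhs in rules
    ]


def is_consistent(rules):
    """True if no left-hand side maps to two different classes."""
    seen = {}
    for lhs, rhs in rules:
        if seen.setdefault(lhs, rhs) != rhs:
            return False
    return True


def compress(rules, feature_names):
    """Greedily drop features that leave the rule set consistent,
    then merge rules that have become identical."""
    for name in feature_names:
        candidate = drop_feature(rules, name)
        if is_consistent(candidate):
            rules = candidate
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(rules))


# Hypothetical toy cases: "day" and "cough" carry no information about the class.
cases = [
    ({"fever": "yes", "cough": "yes", "day": "mon"}, "flu"),
    ({"fever": "yes", "cough": "yes", "day": "tue"}, "flu"),
    ({"fever": "no", "cough": "yes", "day": "mon"}, "cold"),
    ({"fever": "no", "cough": "yes", "day": "wed"}, "cold"),
]
rules = cases_to_rules(cases)                       # one rule per case
short = compress(rules, ["day", "cough", "fever"])  # fewer, shorter rules
for lhs, rhs in short:
    print(dict(lhs), "->", rhs)
```

The greedy order in which features are tried affects the result, so this shows only one way of moving from the case-per-rule end of the spectrum toward a small rule set.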